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Abstract 

The ambient particulate chemical composition data with 
three particle diameter sizes (2.5mm < Dp < 1.15 mm, 
1.15mm < Dp < 0.34mm and 0.34mm < Dp < 0.1mm) 
collected at a major industrial center in Allen Park in De- 
troit, MI is examined. Standard multiway (tensor) meth- 
ods like PARAFAC and Tucker tensor decompositions have 
been applied extensively to many chemical data. However, 
for multiple particle sizes, the source apportionment analy- 
sis calls for a novel multiway factor analysis. We apply the 
regularized block tensor decomposition to the collected air 
sample data. In particular, we use the Block Term Decompo- 
sition (BTD) in rank-(L; L; 1) form to identify nine pollution 
sources (Fe+Zn, Sulfur with Dust, Road Dust, two types of 
Metal Works, Road Salt, Local Sulfate, and Homogeneous 
and Cloud Sulfate). 

1 Introduction 

Source apportionment analysis determines the sources 
of various air pollutants at a particular location. 
Through a collected sample of environmental data, the 
ambient particulate chemical composition data is ac- 
quired and then analyzed. This technique of identifying 
and quantifying the sources of air pollutants at a recep- 
tor location by resolving the mixture of chemical species 
into the contributions from the individual source types 
is called receptor modeling when factor analysis used. 
Factor analysis utilizes matrix methods, like PC A and 
PMF. In the study of source apportionment of airborne 
particles, the measured chemical composition data from 
the collected samples form a matrix in terms of chemi- 
cal species and time samples which can be decomposed 
into two matrices representing sources contributions and 
sources profiles. Matrix methods have been a very ef- 
fective tool for source apportionment for collected sam- 
ples of fine particles as well as coarse particles; see e.g. 
[22j [131 [15] . However, chemical data can have more at- 
tributes, such as particle size distribution which is com- 
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mon in identification problems of air pollution sources 
[24 . Thus, a two-way analysis is not sufficient. Multi- 
way factor analysis (tensor decompositions) can provide 
better estimations. 

We examine the ambient particulate data [24] col- 
lected at Allen Park in Detroit, MI between February 
and April 2002. It follows the following three-way re- 
ceptor model: 

p 

where the measured data Xijk is the concentration 
value of the ith sample of the kth particle size range 
(0.1 - 0.34/im, 0.34 - 1.15/im, 1.15 - 2.5/im) of the jth 
species, a^p is the pth source mass contribution during 
the time units for the ith sample, bijk is the jth species 
mass fraction of the kth. particle size range from the pth 
source, eijk is the measurement error, and P is the total 
number of independent sources. 

In this paper, we reformulate the three-way re- 
ceptor model in a block tensor structure format and 
utilize a new regularized tensor algorithm found in 
[20 . Previous tensor applications in chemometrics 
or environmental data analysis have been limited to 
tensor CANDECOMP/PARAFAC (CP) decomposition 
[11 [TOj [21 [3l [m [1^ or tensor Tucker decomposition 
[25l \26\ |8]. Instead we focus on a more general type 
of tensor decomposition - Block Term Decomposition 
(BTD) m [7], and its application in environmental 
pollution. In fact, the data model fits the BTD in rank- 
(L,L,1). 

Here we apply the regularized alternating method 
(RALS) [20^ for the BTD problems. Then, the method 
is used to analyze a real-life air sample tensor dataset. 
We show the efficacy of the regularized alternating algo- 
rithm (BTD-RALS) for solving BTD through numerical 
examples as well as prove a convergence property of the 
algorithm. In the application part, we show that the air 
sample data model in [24 follows the BTD format, so 
that the BTD-RALS method is used to identify all the 
different factor pollution sources. 

The paper is organized as follows. We introduce the 
notation in Section 2. In Sections 3 and 4, we describe 
some background of the block term decomposition and 



then discuss a convergence property of the algorithm. 
The real data example is given in Section 6. Concluding 
remarks are discussed in Section 6. 

2 Preliminaries 

We denote the scalars in R with lower-case letters 
(a,6, ...) and the vectors with bold lower-case letters 
(a, b, ...). The matrices are written as bold upper- 
case letters (A,B,...) and the symbol for tensors are 
calligraphic letters {A, S, . . .). The subscripts represent 



the following scalars: 



(A)i 



(a)i = ai and is the rth column of A. The 
superscripts indicate the length of the vector or the size 
of the matrices. For example, is a vector with length 
K and B^^^ is a x K matrix. In addition, the lower- 
case superscripts on a matrix indicate the mode in which 
it has been matricized. For example, T^^) is the mode-n 

p/X JXK 



matricization of the tensor T G 



for 



1,2,3. 



Definition 2.1. The Kronecker product of matrices A 
and B is defined as 



aiiB 
a2iB 



ai2B 
a22B 



Definition 2.2. The Khatri-Rao product is the 
^'matching columnwise^' Kronecker product. Given 



matrices A G 



pIxK 



and B G 



pJxK 



their Khatri-Rao 



product is denoted A B. The result is a matrix of 
size {IJ X K) defined by 



A B = [Ai (g) Bi 



If a and b are vectors, then the Khatri-Rao and Kro- 
necker products are identical, i.e., a0b = a0b. 

Definition 2.3. Let A = [Ai ... A^^] and B = 
[Bi ... Bi^] he two partitioned matrices. Then we 
define a product of A and B^ denoted Qp, which is 



A 0p B = [Ai Bi 



Definition 2.4. (MoDE-n matricization) 
Matricization is the process of reordering the ele- 
ments of an Nth order tensor into a matrix. The 
mode-n matricization of a tensor T G M^ix^2x---x/Ar 
is denoted by T(n) ci'^d arranges the mode-n fibers, 
the vectors obtained from fixing every index with the 
exception of the nth mode, as the columns of the 
resulting matrix. 



the ith column ofM., is denoted by vec{M.) which is a 
vector of size mn defined by 



veciM) 



mi 



Definition 2.6. (Frobenius-norm) The Frobenius 
norm of a tensor X G x^2x---x7Ar square root 

of the sum of the squares of all its elements. The 
formula is 



\\x\ 



\ 



EE 

il=li2=l 



E 



3 Block Term Decompositions 

Let A' be a real- valued third-order tensor of size I x J x 
K. A decomposition of A' in a sum of rank-(L^,L^, 1) 
terms with 1 < r < is a decomposition of the form 



(3.1) 



O Cr 



p/x J 



IS 



in which the rank of the matrices G 
Lr and c^ are vectors of length K. Elementwise, the 
decomposition is defined as 



So X is decomposed into a sum of matrix-vector 
outer products. If we decomposes into two matrices, 
i.e., E^ = A^ • B^, where the matrix A^ G R^^^'^ and 



the matrix B^ G . 



JxLr- 



are rank L^, then the equation 



(3.1) can be written as the following formula. 



R 



(3.2) 



A' = ^(A^-B^)oc^. 



We define A = [Ai ••• A^^], B = [Bi ••• B], 
C = [ci • • • Ci^] and call them the factor matrices of 
the BTD-(L^,L^, 1), where A^ and B^ are submatrices 
with r = 1, . . . , . When Lr = L ior I < r < R, 
the decomposition is called BTD in rank-(L, L, 1). We 
will focus on this type of factorization and analyze the 
environmental dataset with BTD-(L, L, 1). 

See Figure [l] for the BTD in rank-(L,L,l) for a 
third-order tensor. 

So, for a given tensor X G W^""^ , the BTD- 
{L^L^ 1) will minimize the error function 



Definition 2.5. (Vectorization) The vectorization 
of a matrix M = [mi • • • m^] G W^^^ ^ where m^ is 



/(A,B,C) 



Ai 



4 Algorithm for Block Term Decomposition 
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Figure 1: BTD-(L,L,1) for X G 



p/x JxK 



The general case of the block term decomposition is 
called BTD in rank-(L, M, A^). It decomposes a tensor 
X e R^x^x^ in a sum of rank-(L, M, TV) terms of the 
form 

R 

(3.4) X = Xi X2 X3 Cr, 

r=l 

in which Vr e rLxMxn ^^^^^ ^ ^ixl^ 
Br e M-^^^^ and G R^^^ are fuh column rank, 
l<r <R. 

The product 'xi', 'X2' and 'X3' are called Tucker 
product which defines the multiplication between a 
tensor and a matrix. Figure |2] shows the BTD in rank- 
(L, M, N) for a third-order tensor. 
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Figure 2: BTD-(L,M,7V) for X e 



r)IxJxK 



Actually, each term of the equation (3.4) is a Tucker 
model [25 . We see here if let N = 1, then each core 
tensor becomes a matrix, thereby reducing the general 
model to BTD-(L^,L^, 1). If the ranks of the matrices 
for each term are same, then it becomes the BTD- 
(L,L,1). Furthermore, if L = M = N = 1, then the 
each core tensor is just a scalar and all the matrices A^, 
B^ and are vectors. So, the BTD-(1, 1, 1) is exactly 
a CP decomposition. 



(3.5) 



X = o o Cr. 



The BTD in rank-(L,L, 1) is a general case of the CP 
decomposition. Thus, the standard and regularized 
algorithms for solving for CP can be applied to the BTD 
case. More details on the algorithms are discussed in the 
next section. 



For the BTD-(L,L, 1), equation (3.2) can be expressed 
to three equivalent equations. If we take the three 



different modes matricization on equation (3.2), then 
we have 



X(i) = A(C©pBf, 
X(2) = B(C©pAf, 
X(3) = C[(Bi0Ai)li 



{BnQAn)lLf, 



where li, is a column vector of all ones of length L. 

The paper of De Lathauwer and Nion [7 proposes 
an algorithm to solve the block term decomposition 
in rank (L,L,1) for a third-order tensor, called the 
alternating least-squares for BTD (BTD-ALS). 

Given a third-order tensor X G R^^^^^, our 



problem is to minimize the function (3.3), i.e 



(4.6) min 

A,B,C 



A'-^(A^-B^)oc, 



with respect to the factor matrices A, B and C. 

From the three equations above, we obtain the 



following three expressions for (4.6): 



min ||X(i) - A(C Op Bnil., 



A,B,C 

min ||X 

A,B,C 

and 

min ||X 

A,B,C 



(2) 



(3) 



B(C 0^ A)^||^, 

C[(Bi Ai)U • • • (B^ A^)U]^||2,. 



These three are equivalent. Instead of solving (4.6) 
for the three factors one time, we can use these three 
equations by fixing all factor matrices but one factor 
each time. Then the problem reduces to three coupled 
linear least-squares subproblems. We have 



A^+i 



argmin ||X(i) - A(C^ 0^ B^)^|||, 
argmin ||X(2) - B(C^ 0^ A^^Y\\l, 



argmm 



:(3)-C[(B^+^0A^^)U 
•••(B^+^0A^+^)U 



where X(i) G R^^-^^, X(2) G R^><^^ and X(3) G R^^^-^ 
are the mode-1, mode-2 and mode-3 matricizations of 
tensor X. 

Given the initials A^, B^ and C^, then at the 
{k + l)th iteration, we hold B^ and to update the 
factor A to get A^+^, then A^+^ and are held to 



update B and obtain B^+^. Similarly, we hold A^+^ 
and B^+^ to obtain C^+^. Usually, the Frobenius 
norm of the error between the given tensor and the 
updated tensor is measured at each iteration to provide 
a stopping criterion. 

There are some disadvantages to alternating least- 
squares algorithm (see [16], [H]). This method is 
not guaranteed to converge to a global minimum or 



even a stationary point of the cost function (4.6), 



only to a solution where the objective function ceases 
to decrease. Another issue of this method is that 
sometimes it needs a high number of iterations to 
converge, creating a swamp. In order to remove the 
swamp, [20] introduced a regularization method with 
the regularization parameter A^. The new algorithm 
for BTD-(L,L, 1) is called BTD-RALS method. 

We add the regularization item for each subproblem 
in the above three equations. 



, n+l 



B 



/c+l 



argmin ||X(i) - A(C" 0^ B")^ ||^ 
+A„||A-A"|||„ 

argmin ||X(2) - B(C" 0^ A^+^f H^, 
+A„||B-B"||2„ 



argmin ||X(3) - C[(Bi"+^ Ai"+^)li, . . . , 

(Br"+i An"+')lLf\\% + A„||C - 

In alternating fashion, these three subproblems are 
solved for the factor matrices A, B and C. The 
regularization parameters A^, n = 0, 1, . . . , are given by 
a nonnegative decreasing sequence and at each iteration 
the parameters are the same for the three updated factor 
matrices. The rules for choosing the regularization 
parameters is also discussed in [JOj. The regularized 
algorithm is summarized in the following Table [l] The 
number of iterations N in the algorithm is set to a large 
number, and a stopping criterion can be used. 

Numerically, such regularized method is more effi- 
cient than the BTD-ALS in terms of reducing the num- 
ber of iterations and accelerating the convergence. 

Example. [Numerical example with a swamp] Here we 
give an example to show the swamp reducing technique 
of the BTD-RALS. 

We create a tensor X G Rio x 1^x28^ which satisfies a 
block term decomposition in rank-(3,3, 1) with R = 3. 
The factor matrices are A G M^°><^, B G M^^><^ and 
C G R^^^^, and the factorization equation is 



RBTD-(L,L,l)-Algorithm [20 
procedure RBTD-(L, L, R, TV, A^) 

give initial guess A^ G R^><^, B^ G R^><^, 



r- l\$KxR 



G 



for n = 1, . . . , TV do 

W ^ [(C^ Qp B^);AnI^^><^^] G 

^iJK+LR)xLR 

S ^ [X(i); A^(A^)^] G RiJK+LR)xI 

^ ^ solving least squares 

to update A 

W ^ [(C^ Qp A^+i);AnI^^>'^^] G 

^iIK^LR)xLR 

S ^ [X(2); A^(B^)^] G RiIK+LR)xJ 

^ % solving least squares 

to update B 

W ^ [((B^+i A5^+i)U,...,(BS+i 

A^+^)l^);AnI^x^] gR(^^+^)x^ 

S ^ [X(3); An(C^)^] G R(Wi?)xK 

Qn+i ^ 0^ solving least squares 

to update C 

end for 

return A^, B^, 
end procedure 



BT)oc. 



Table 1: Regularized algorithm of BTD-(L,L, 1) 
with rank R for a third-order tensor A' G R^^^^^ 



We use the same random initials for both BTD- 
ALS and BTD-RALS methods. Figure |3] shows that 
the BTD-RALS algorithm only takes 1558 iterations 
to reach an error within 10~^, however, the BTD- 
ALS algorithm does not decrease the error after 20,000 
iterations. 

4.1 Convergence Property of RALS The regu- 
larized method of ALS (RALS) [2Q^ for solving ten- 
sor CP decomposition and the convergence property 
of RALS have been studied in [19]. We have pointed 
out that the decomposition BTD-(I/,L,1) is a general 
case of CP decomposition. In this section, we will show 
that the BTD-RALS has the same framework with the 
RALS and thus, the BTD-RALS has the same conver- 
gence property. 



Evolution of the cost function 
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Figure 3: The comparison of the BTD and RBTD 
with the same initials 



We can view the BTD-(L,L, 1) problem as a non- 
linear optimization. Problem (4.6) has the following 
expression, 



(4.7) ^mm^/(A,B,C) 



I J K 

EEE 



mm 

A,B,C 



E 



L 



r=l ^1=1 



where A = [Ai---Ar], B = [Bi---Bi^] and C = 

[ci • • • cji] are the factor matrices. a^[^ denotes the il 
element {ith row and Ith column) of the matrix A^, 6^^^ 

(r) 

expresses the jl element of the matrix B^, and is 
the kth element in the vector c^. 

So the cost function can be seen as a function from 
y to R, where 

y = vec{[vec{A) vec{B) vec{C)]) G 

with n = {IL^ JL^ K)R. 

Let vec{A) = yi G M^^^, vec(B) = y2 G M^^^ and 
vec{C) = y3 G M^^. We partition the vector y G M'' 
into 3 component vectors y^ G i = 1,2,3, where 

ni = ILR^ n2 = JLR and ns = KR. It follows that 
y = yi X y2 X ya G X R^^ x R^3 = rj.^^^^ 

the BTD-(I/,L, 1) can be reformulated to the following 
problem. 



minimize 
subject to 



/(yi,y2,y3), 

y G M''^ X X 



Therefore, the BTD-RALS method for solving 
the BTD-(L,L, 1) updates each component of y in 



turn. Starting from a given initial point y^ = 
vec{[vec{A^) vec{B^) vec{C^)]) G M^, a sequence 
{(yi^yt^yf)} generated by the following equations 





= argmin 






= argmin 






= argmin 





,z,y^) + \k\\yl-z\\ 



y2^+\z) + Afc||y3^ 



Thus, the BTD-RALS algorithm has the same 
framework as the RALS for CP decomposition [19 . 
Moreover, the convergence property of RALS is applied 
to the BTD-RALS. 

For the BTD-RALS method, we have the following 
theorem. 

Theorem 4.1. Suppose that the sequence 
{(A^,B^,C^)} obtained from the ^TD-RALS has 
limit points. Then every points (A,B,C) is a critical 
point of the problem 



Recall that a sequence that has a limit point 
(A,B,C) means that there exists a subsequence of 
{(A^,B^,C^)} that converges to {(A^, B^, C^)}. The 
following is a critical point definition found in [1 . The 
critical point (A,B,C) of the problem (4.7) is a point 
such that 



V/(A,B,C)^ (^(A,B,C)-(A,B,C)J> 0, V(A,B,C). 

According to this theorem, we can see that for a 
non-degenerate BTD problem [16] , [19] , the limit points 
obtained from the BTD-RALS method are the critical 



point of the original cost function (3.3). 



5 Experiments 

In this section, the air pollution collected data [24 is 
analyzed in rank-(9,9, 1) BTD using the BTD-RALS 
algorithm. Several figures are then created from the 
numerical results which explain the sources' identifica- 
tion via the sources profiles and time series of the source 
contributions. We also provide some numerical simula- 
tions on randomly generated tensor noisy data. 

5.1 Environmental Data The original data was 
collected from 25 February to 10 April 2002 at Allen 
Park in Detroit, Michigan. The data was then sampled 
by using a three-stage Davis Rotating-drum Universal- 
size-cut Monitoring (DRUM) impactor [23,, [24,. The 
particles were collected in three size modes, 2.5/im > 
Dp > 1.15/im, 1.15/im > Dp > 0.34/im and 0.34/im > 
Dp > 0.1/im, where Dp denotes the particle diameter. 
The air sample was also analyzed for elements of higher 



atomic number. The 27 chemical species found were Na, 
Mg, Al, Si, P, S, CI, K, Ca, Ti, V, Cr, Mn, Fe, Co, Ni, 
Cu, Zn, Ga, As, Se, Br, Rb, Sr, Zr, Mo, and Pb. 

The data we study is considered as a function of 
size, time and chemical composition (a.k.a. elemental 
species). If we use i to denote chemical species, j to 
express particle size and k to be the time sample, then 
a data point Xijk can be expressed as the concentration 
value of the ith chemical species of the jth particle size 
of the kth time sample. 

See Figure |4] for the air sample tensor picture. 




316 



Medium 



Particle Size 



Large 



Time Sample 



where the Xijk is the element of the third-order tensor 
obtained from the previous section, i.e., the concentra- 
tion value of the ith chemical species of the jth particle 
size range of the kth time sample. In the elemental 
form, bkp is the pth source mass contribution during 
the time units for the kth sample, aijp is the ith species 
mass fraction of the jth particle size range from the 
pth source. Cijk is the residual associated with the ith 
species concentration measured in the kth sample of the 
jf'th size range, and P is the total number of independent 
sources. 

Since each entry Xijk denotes the concentration, a^jp 
is the species mass fraction and bkp is the source contri- 
bution, then we need add non-negativity constraints on 
aijp and bkp when the tensor decomposition is applied. 



We see that in the equation (5.8) 



is an element 



of the air sample tensor A'. For each fixed p, aijp is 
an entry of a matrix Ap and bkp is an element of a 
vector hp. Finally, the error tensor can be denoted as 
S. Therefore, the equation (5.8) is equivalent to the 
following form. 



(5.9) 



A" = Ap o bp + g. 



Since A* G 



p=i 



p27x3x316 



for each matrix Ap G 



it can be decomposed into two matrices Cp G 



^27x3^ 
p27xLp 



and B G 



so that Ar, 



rank of matrix Ap. 
written as follows, 



Therefore, the model also can be 



where Lp is the 



Figure 4: The sampled data are placed on the 
three planes to construct the air tensor A'. Par- 
ticle size denotes mode- J, Chemical composition 
denotes mode-/ and Time sample denotes mode- 
K. So, the element Xijk in A* is the concentration 
value of the ith chemical species of the jth par- 
ticle size range of the kth time sample. 

There are 27 chemical species and three different 
size of particle (small, medium, large), and according to 
[24] . the time sample are 316 (3 time samples at first 
day; 8 time points on each day from the second to the 
last second day; 1 time point in the last day). So, the 
air tensor is a third-order tensor of size 27 x 3 x 316. 

5.2 Model Description According to [24^, in order 
to separate the different factor sources from the dataset, 
the main equation is as follows: 



(5.10) X = ^{Cp--D'^)ohp + £. 

p=i 

Thus, our goal is to find the matrices C = 
[Ci • • • Cp], D = [Di • • • Dp] and B = [bj • • • bp] 
to minimize the following function. 



(5.11) Q(C,D,B) 



This is exactly the error function for BTD (3.3) in rank- 
{Lr^Lr^ 1). When Li = Lj, 1 < z, j < i?, it is the error 
function for BTD in rank-(L,L, 1). 

Since the BTD-(L^, L^, 1) matches the model for 
the air dataset, we can use the algorithm BTD-RALS 
to solve for the air sample tensor to minimize the above 



function (5.11). 



In 23], the cost function used is 



(5.8) 



ijp^kp H~ ^ijki 



(5.12) g 



I J K 

i = l j = l k=l 



p=l (^ijp^k 



^ijk 



where Uijk is an uncertainty estimate element in the 
ith species of the jth particle size of the kth time 
sample. The procedure of Polissar [22^ was used to 
assign measured data and the associated uncertainties 
as the input data. 



contribution of p source in terms of time. Therefore, 



Comparing the functions (5.12) (element- wise) and 



(5.11), the cost function in our method does not include 



the uncertainty estimates. We will use the cost function 
without the uncertainties Uijk and consider the function 



(5.12) as a constraint on our solution. 

According to the paper [24 , there are 9 sources. 
Thus, in the BTD model (|5.11|), we let P = 9. For the 



tensor A' G ]^27x3x3i6^ ^^le block term decomposition in 
rank-(L^, L^, 1) with 9 terms is not essentially unique 
(see ^ ). Recall that essentially uniqueness indicates 
the decomposition is unique up to permutation and 
scaling. Since the tensor data does not satisfy the 
uniqueness criteria, then we will have multiple solutions 
from the algorithm. We tested a large number of initial 
conditions with different Lp^ 1 < p < 9 and found that 
under the setting of = 9, p = 1, 2, . . . , 9, we can find 
a solution that is consistent with the numerical results 
found in [24]. 

For the non-negativity constraints on the decom- 
position, we need to use the BTD-RALS method to 
solve for the non-negative factor matrices in the prob- 
lem (5.12). So we will add non-negativity constraints 
on the three subproblems. In terms of solving these 
subproblems, for each least-squares problem with non- 
negativity constraints, we can use the method in [17], or 
we can use the non-negative matrix factorization (NMF) 
introduced by Lee and Seung [18] to solve the each sub- 
problem with constraint. 

In this paper, we use the algorithm in Table [l] with 
the method by [I7] to obtain the three non-negative 
factor matrices C, D and B in the cost function ( |5.11 ). 



5.3 Result and Discussion We apply the BTD- 



(9, 9, 1) on the air sample tensor A' (5.11). By using the 
BTD-RALS algorithm with non-negativity constraints, 
we can obtain the three factor matrices 



C= [Ci ■•■ Cg], D= [Di 



where C„ G 



p27x9 



Do 



p3x9 



[bi 



and B G 



p316x9 



p = 1,2,. ..,9. Therefore, according to the model for 
the air sample, for each p, Ap = 
matrix and the elements of such matrix are exactly the 



Dl e R2^x3 is a 



pS (for a fixed p) in the equation (5.8). Furthermore, 



each Ap denotes a source profile. So, for each Ap we 



have a bar plot for one source (see the Figure [5a|) . The 
vector bp in the factor matrix B is the vector bkp (for 
a fixed p) in the model equation (5.8). It denote the 



the Figure [5b| expresses the time series of the source 
contributions. 

From the source profile bar plot |5a| we can figure 
out the nine different factor sources, they are: Industrial 
(Fe+Zn), Sulfur with Dust, Road Dust, two types of 
Metal Works, Local Sulfate, Road Salt, Homogeneously 
formed Sulfate and Cloud Processed Sulfate. For the 
explanation for each source, refer to the paper [24]. In 
the Figure [5bl we can also see the change of each source 
contribution in time. For example, there are several 
spikes in the time series of the road source contribution. 
This indicates additional snowfall or low temperatures 
where the ice melting. Furthermore, we can also tell the 
contribution change of other factor sources. 

The identification from the bar plots in Figure 5a 
is also based the particle sizes. This method provides 
a more accurate result comparing the classical matrix 
factorization method. It is seen that in the industrial 
source, the high concentrations of Fe and Zn are in the 
middle size rage while the large size rage of Fe and Zn 
are shown in the metal works. 

We can also analyze the concentration during the 
weekdays and weekends for each factor source (see the 
Figure [6|. The left nine plots show the concentration 
change during the weekdays and the corresponding 
right plots show the concentration change during the 
weekends for each factor source. In each day, we take 
the maximum and minimum concentration at the time 
points 1, 4, 7, 10, 13, 16, 19, 22 and then take the 
average for each time point. Figure [6] shows the factors 
concentrations of Road Salt, Homogeneous Sulfate and 
the second type of Metal Work are high during the 
weekdays and lower during the weekends. This is very 
reasonable since the weekdays are typically work days 
when industrial companies are in operation. There is 
also a longer commuting hours during the weekdays 
than the weekends, explaining the higher concentration 
of Road Salt. For some factors like local sulfate, there is 
no difference between the weekdays and weekends which 
indicates the people's activities are not a major effect 
on these environmental factors. 

With the Figures [5al [5b| and [6] and above discussion, 
we show that the BTD-RALS method can successfully 
identify the nine different factor sources and also pro- 
vides the time contribution of each factor. 

5.4 BTD-RALS on the noisy data Here we give a 
numerical experiment to show the BTD-RALS method 
works on the third-order tensor data with noise. We 
generate tensors T G ]R^x^><'^ in the following way: 
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(a) Source profiles for the resolved factors. The y-axis denotes the (b) The time series of source contributions, 

relative elemental concentration. The order of the particle size is 
Large, Medium, Small from left to right for each element. 



Figure 5: Nine factor sources identified from the air sample. 
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Figure 6: Time of a day variations. The left column shows the concentration change in weekdays for 
each factor source. The right column shows the concentration change in weekends for each factor 
source 



where T has the block term decomposition in rank- 
(2, 2, 1) with = 3 in the equation (|3.2|) so that 



and 



p7xl 



1,2,3. 



The second term in (5.13) is the noise term and the 
parameter controls the noise level. The entries of 
A = [Ai A2 A3], B = [Bi B2 B3], C = [ci C2 C3] and 
the tensor M are drawn from a zero-mean unit-variance 
Gaussian distribution. 

By using BTD-RALS method, a Monte Carlo ex- 
periment of BTD-(2,2, 1) with = 3 on T consisting 
50 runs is carried out. The algorithm is initialized with 
three random starting values. _ 

We measure the relative error e = ||C — C||/||C||, 
where C is the approximation of C, optimally permuted 
and scaled. The median results are plotted in Figure [7| 
It is seen that with low noise levels, average error in C 
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Figure 7: Median relative error v.s. noise level 



increases proportionally to noise level. 
6 Conclusion 

The method BTD-RALS is presented in the application 
of identifying the factor sources of the collected air sam- 
ple. The dataset is formed into a third order tensor and 
the data model is written into a block term decompo- 
sition in rank-(L, L, 1). We apply a regularized alter- 
nating method to solve the block term decomposition 
and obtain the resulting factor matrices providing the 
source profiles and source contributions correctly. In 
addition, we show that regularized method is efficient 
than the classical alternating method numerically and 
can converge to a stationary point of the original BTD 
cost function under some assumption. The BTD-RALS 
algorithm is also tested on the random data with dif- 
ferent noise levels, which shows the average error in the 
third factor matrices C increases proportionally to noise 



level. 
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